SETRED: Self-training with Editing

Authors

  • Ming Li
  • Zhi-Hua Zhou
Abstract

Self-training is a semi-supervised learning algorithm in which a learner keeps on labeling unlabeled examples and retraining itself on an enlarged labeled training set. Since the self-training process may erroneously label some unlabeled examples, sometimes the learned hypothesis does not perform well. In this paper, a new algorithm named Setred is proposed, which utilizes a specific data editing method to identify and remove the mislabeled examples from the self-labeled data. In detail, in each iteration of the self-training process, the local cut edge weight statistic is used to help estimate whether a newly labeled example is reliable or not, and only the reliable self-labeled examples are used to enlarge the labeled training set. Experiments show that the introduction of data editing is beneficial, and the learned hypotheses of Setred outperform those learned by the standard self-training algorithm.
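The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the base learner, the neighborhood graph, the edge weights, or the significance threshold, so the k-NN learner, the k-nearest-neighbor graph, the weights 1/(1 + distance), the normal approximation, and every name here (`knn_predict`, `is_reliable`, `setred_sketch`, `theta`, `per_iter`) are hypothetical choices made for the example.

```python
import numpy as np
from statistics import NormalDist


def knn_predict(X_train, y_train, X, k=3):
    """Plain k-NN (assumed base learner): labels and vote-fraction confidence."""
    d = np.linalg.norm(X[:, None, :] - X_train[None, :, :], axis=2)
    idx = np.argsort(d, axis=1)[:, :k]
    classes = np.unique(y_train)
    votes = np.stack([(y_train[idx] == c).mean(axis=1) for c in classes], axis=1)
    return classes[votes.argmax(axis=1)], votes.max(axis=1)


def is_reliable(i, X, y, k=5, theta=0.1):
    """Local cut edge weight test for example i (simplified, assumed form).

    A cut edge joins i to a neighbor carrying a different label. Under the
    null hypothesis that labels are assigned independently of position, the
    summed weight of cut edges is approximately normal. Example i is kept
    only if its observed cut weight is significantly smaller than expected.
    """
    d = np.linalg.norm(X - X[i], axis=1)
    d[i] = np.inf                        # exclude the example itself
    nbrs = np.argsort(d)[:k]             # assumption: k-NN graph
    w = 1.0 / (1.0 + d[nbrs])            # assumption: edge weights
    cut = float(w[y[nbrs] != y[i]].sum())
    p_diff = 1.0 - float(np.mean(y == y[i]))   # null prob. of a cut edge
    mu = p_diff * float(w.sum())
    sigma = float(np.sqrt(p_diff * (1.0 - p_diff) * (w ** 2).sum()))
    if sigma == 0.0:                     # only one class present: no cut edges
        return True
    return NormalDist(mu, sigma).cdf(cut) < theta


def setred_sketch(X_l, y_l, X_u, n_iter=5, per_iter=10, k=3):
    """Self-training in which only edited (reliable) self-labels are kept."""
    X_l, y_l, X_u = X_l.copy(), y_l.copy(), X_u.copy()
    for _ in range(n_iter):
        if len(X_u) == 0:
            break
        pred, conf = knn_predict(X_l, y_l, X_u, k)
        pick = np.argsort(-conf)[:per_iter]          # most confident candidates
        X_all = np.vstack([X_l, X_u[pick]])
        y_all = np.concatenate([y_l, pred[pick]])
        n_l = len(X_l)
        keep = np.array(
            [j for j in range(len(pick)) if is_reliable(n_l + j, X_all, y_all)],
            dtype=int,
        )
        X_l = np.vstack([X_l, X_u[pick][keep]])
        y_l = np.concatenate([y_l, pred[pick][keep]])
        # Assumption: rejected candidates are dropped rather than returned
        # to the unlabeled pool.
        X_u = np.delete(X_u, pick, axis=0)
    return X_l, y_l
```

Calling `setred_sketch(X_l, y_l, X_u)` with NumPy arrays of labeled features, integer labels, and unlabeled features returns the enlarged labeled set. Removing the `is_reliable` filter turns the loop back into standard self-training, which is the baseline the abstract compares against.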

Similar resources

The Significance of Peer-Editing in Teaching Writing to EFL Students

This study set out to investigate the effect of peer-editing as a metacognitive strategy on the development of writing. It was hypothesized that peer-editing could be used to raise grammatical and compositional awareness of the learners. Forty pre-intermediate sophomores at Islamic Azad University-Tabriz Branch participated in the study, taking the course Writing I. To warrant the initial homo...

Editing (Virayesh) as a Movement of Resistance During the Iran-Iraq War

The present study concerns editing of translations in Iran during the Iran-Iraq War, which in the official discourse of the country is known as the Sacred Defense. It argues that editing, in its local sense, advocated a linguistic purism inspired by a redefined nationalism, which went hand in hand with identity politics and snowballed into a movement of resistance.

Language Adaptation for Extending Post-Editing Estimates for Closely Related Languages

This paper presents an open-source toolkit for predicting human post-editing efforts for closely related languages. At the moment, training resources for the Quality Estimation task are available for very few language directions and domains. Available resources can be expanded on the assumption that MT errors and the amount of post-editing required to correct them are comparable across related ...

Semi-supervised multi-label image classification based on nearest neighbor editing

Semi-supervised multi-label classification has been applied to many real-world applications such as image classification, document classification and so on. In semi-supervised learning, unlabeled samples are added to the training set for enhancing the classification performance, however, noises are introduced simultaneously. In order to reduce this negative effect, the nearest neighbor data edi...

Enhanced Texture Editing using Self Similarity

Texture mapping is an indispensable tool for achieving realism in computer graphics. Significant progress has been made in recent years with regards to the synthesis and editing of 2D texture images. However, the exploration of user control for semi-automatic texture editing remains an open area of research. We present methods that partially address the semantic and technical limitations of Sel...

Publication date: 2005